Subset Quantile Normalization Using Negative Control Features
Abstract
Normalization has been recognized as a necessary preprocessing step in a variety of high-throughput biotechnologies. A number of normalization methods have been developed specifically for microarrays, some general and others tailored for certain experimental designs. All methods rely on assumptions about data characteristics that are expected to stay constant across samples, although some state these assumptions more explicitly than others. Most methods assume that certain quantities related to the biological signal of interest stay the same; this is reasonable for many experiments but usually not verifiable. Recently, several platforms have begun to include a large number of negative control probes that nonetheless cover nearly the entire range of the measured signal intensity. Using these probes as a normalization basis makes it possible to normalize without making assumptions about the behavior of the biological signal. We present a subset quantile normalization (SQN) procedure that normalizes based on the distribution of non-specific control features, without restriction on the behavior of specific signals. We illustrate the performance of this method using three different platforms and experimental settings. Compared to two other leading nonlinear normalization procedures, the SQN method preserves more biological variation after normalization while reducing the noise observed on control features. Although the illustration datasets are from microarray experiments, the method applies to any high-throughput technology that includes a large set of control features with constant expectations across samples. It does not require an equal number of features in all samples and tolerates missing data.
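The idea of normalizing on control features alone can be illustrated with a minimal Python sketch. This is a simplified illustration under stated assumptions, not the published SQN algorithm: the function name `subset_quantile_normalize`, the `n_grid` parameter, the use of an averaged control quantile function as the reference, and the constant-shift rule for values outside the control range are all choices made here for illustration.

```python
import numpy as np


def subset_quantile_normalize(samples, control_masks, n_grid=101):
    """Map each sample so that its control-feature distribution matches a
    shared reference built only from control features.

    samples:       list of 1-D float arrays (lengths may differ, NaN = missing)
    control_masks: list of boolean arrays marking negative control features
    n_grid:        number of quantile levels used to summarize the controls
                   (hypothetical parameter of this sketch)
    """
    probs = np.linspace(0.0, 1.0, n_grid)

    # Control quantiles per sample; NaN values among the controls are ignored.
    ctrl_q = [np.nanquantile(x[m], probs) for x, m in zip(samples, control_masks)]

    # Reference distribution: average control quantile function across samples.
    ref_q = np.mean(ctrl_q, axis=0)

    normalized = []
    for x, q in zip(samples, ctrl_q):
        out = np.full(x.shape, np.nan)
        observed = ~np.isnan(x)
        v = x[observed]

        # Inside the control range: piecewise-linear map from this sample's
        # control quantiles to the reference quantiles. Heavily tied control
        # values would make q non-increasing and the map less well defined.
        mapped = np.interp(v, q, ref_q)

        # Outside the control range (assumption of this sketch): apply the
        # constant offset found at the nearest end of the control range.
        below = v < q[0]
        above = v > q[-1]
        mapped[below] = v[below] + (ref_q[0] - q[0])
        mapped[above] = v[above] + (ref_q[-1] - q[-1])

        out[observed] = mapped  # missing values stay NaN
        normalized.append(out)
    return normalized
```

Because the reference is built from quantiles of the control features only, the samples need not have the same number of features, and missing values are passed through unchanged. No assumption is placed on the distribution of the non-control features; they are simply carried along by the map estimated from the controls.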
Related articles
An Empirical Evaluation of Normalization Methods for MicroRNA Arrays in a Liposarcoma Study
BACKGROUND Methods for array normalization, such as median and quantile normalization, were developed for mRNA expression arrays. These methods assume few or symmetric differential expression of genes on the array. However, these assumptions are not necessarily appropriate for microRNA expression arrays because they consist of only a few hundred genes and a reasonable fraction of them are antic...
Removing technical variability in RNA-seq data using conditional quantile normalization
The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RN...
Impact of normalization on miRNA microarray expression profiling.
Profiling miRNA levels in cells with miRNA microarrays is becoming a widely used technique. Although normalization methods for mRNA gene expression arrays are well established, miRNA array normalization has so far not been investigated in detail. In this study we investigate the impact of normalization on data generated with the Agilent miRNA array platform. We have developed a method to select...
Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data
Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expr...
Faster cyclic loess: normalizing RNA arrays via linear models
MOTIVATION Our goal was to develop a normalization technique that yields results similar to cyclic loess normalization and with speed comparable to quantile normalization. RESULTS Fastlo yields normalized values similar to cyclic loess and quantile normalization and is fast; it is at least an order of magnitude faster than cyclic loess and approaches the speed of quantile normalization. Furth...
Journal:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 17, Issue 10
Pages: -
Publication date: 2010